PreCog: Improving Crowdsourced Data Quality Before Acquisition

Authors

  • Hamed Nilforoshan
  • Jiannan Wang
  • Eugene Wu
Abstract

Quality control in crowdsourcing systems is crucial. It is typically done after data collection, often using additional crowdsourced tasks to assess and improve the quality. These post-hoc methods can easily add cost and latency to the acquisition process, particularly if collecting high-quality data is important. In this paper, we argue for pre-hoc interface optimizations based on feedback that helps workers improve data quality before it is submitted; this approach is well suited to complement post-hoc techniques. We propose the Precog system, which explicitly supports such interface optimizations for common integrity constraints as well as for more ambiguous text acquisition tasks where quality is ill-defined. We then develop the Segment-Predict-Explain pattern for detecting low-quality text segments and generating prescriptive explanations that help workers improve their text input. Our unique combination of segmentation and prescriptive explanation is necessary for Precog to collect 2× more high-quality text data than non-Precog approaches on two real domains.
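
As a rough illustration of the Segment-Predict-Explain idea named in the abstract, the following minimal sketch splits a worker's text into segments, flags the ones predicted to be low quality, and attaches a prescriptive hint to each. Everything here is an assumption made for illustration: the sentence-level segmentation, the word-count heuristic standing in for a learned quality model, the function names (`segment`, `predict_low_quality`, `explain`, `review`), and the `min_words` threshold are hypothetical and are not taken from the Precog system.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Feedback:
    """A flagged segment paired with a prescriptive hint for the worker."""
    segment: str
    message: str


def segment(text: str) -> list[str]:
    """Segment step: split text into sentence-like units (assumed granularity)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def predict_low_quality(seg: str, min_words: int = 6) -> bool:
    """Predict step: toy heuristic standing in for a trained quality model."""
    return len(seg.split()) < min_words


def explain(seg: str, min_words: int = 6) -> Feedback:
    """Explain step: turn the prediction into a concrete suggestion."""
    missing = min_words - len(seg.split())
    return Feedback(seg, f"Add about {missing} more word(s) of detail to this sentence.")


def review(text: str) -> list[Feedback]:
    """Run all three steps and return feedback only for flagged segments."""
    return [explain(s) for s in segment(text) if predict_low_quality(s)]


if __name__ == "__main__":
    draft = "Great phone. The battery lasted two full days on a single charge."
    for fb in review(draft):
        print(f"{fb.segment!r}: {fb.message}")
```

In a pre-hoc setting, feedback of this kind would be shown in the input form before the worker submits, so only the flagged segment needs revising rather than the whole response.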

Similar resources

Crowdsourced Knowledge Acquisition: Towards Hybrid-Genre Workflows

Novel social media collaboration platforms, such as games with a purpose and mechanised labour marketplaces, are increasingly used for enlisting large populations of non-experts in crowdsourced knowledge acquisition processes. Climate Quiz uses this paradigm for acquiring environmental domain knowledge from non-experts. The game’s usage statistics and the quality of the produced data show that ...

Transforming research results into useful tools for global health: BOOST.

We reported the results of the PRECOG study in the inaugural issue of The Lancet Global Health (July, 2013). Although visual outcomes of cataract surgery have usually been assessed weeks or months after surgery, this study of 4000 patients at 40 hospitals in low-income and middle-income countries (LMICs), where few patients return after operations, demonstrated that assessment of vision the day...

DeScript: A Crowdsourced Corpus for the Acquisition of High-Quality Script Knowledge

Scripts are standardized event sequences describing typical everyday activities, which play an important role in the computational modeling of cognitive abilities (in particular for natural language processing). We present a large-scale crowdsourced collection of explicit linguistic descriptions of script-specific event sequences (40 scenarios with 100 sequences each). The corpus is enriched wi...

Managing Quality of Crowdsourced Data

The Web is the central medium for discovering knowledge via various sources such as blogs, social media, and wikis. It facilitates access to contents provided by a large number of users, regardless of their geographical locations or cultural backgrounds. Such user-generated content is often referred to as crowdsourced data, which provides informational benefit in terms of variety and scale. Yet...

Improving Crowdsourced Live Streaming with Aggregated Edge Networks

Recent years have witnessed a dramatic increase of user-generated video services. In such user-generated video services, crowdsourced live streaming (e.g., Periscope, Twitch) has significantly challenged today’s edge network infrastructure: today’s edge networks (e.g., 4G, Wi-Fi) have limited uplink capacity support, making high-bitrate live streaming over such links fundamentally impossible. I...

Journal title:
  • CoRR

Volume abs/1704.02384

Publication date 2017